We'll use some operators from Chapter 3. If you want to write images out or to see them as pictures, you'll need to normalise them:
From Chapter 4, we'll now start using edge detection, so we need the Sobel operator, together with edge magnitude and direction.
Feature Extraction & Image Processing,
M. S. Nixon and A. S. Aguado
Elsevier/Butterworth-Heinemann, 2nd Edition, 2007
This worksheet is the companion to Chapter 5 and implements the basic feature extraction techniques described therein. The worksheet follows the text directly.
First, we require a routine to match a template to an image: template matching. This is implemented by template convolution using a match functional. For each point within the template, if the image value matches the template value, then the vote count is increased by one. The total number of votes is then stored in an accumulator array, which collects the evidence. The point in the accumulator array where the votes are stored is the current position of the template. Accordingly, the accumulator array is two dimensional in terms of spatial position. This is then used to find the position of a shape in an image: the template is the shape, and the maximum value in the accumulator array gives the position where the template best matched the image. Our operator, t_match, is then:
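(The worksheet defines t_match in Mathcad. For readers following along outside Mathcad, here is a minimal sketch of the same idea in Python/NumPy; the array names and the exact-match test are illustrative assumptions, not the book's code.)

    import numpy as np

    def t_match(image, template):
        # direct template matching: count exact pixel matches at every template position
        rows, cols = image.shape
        trows, tcols = template.shape
        acc = np.zeros((rows - trows + 1, cols - tcols + 1), dtype=int)
        for y in range(rows - trows + 1):
            for x in range(cols - tcols + 1):
                window = image[y:y + trows, x:x + tcols]
                acc[y, x] = np.sum(window == template)   # one vote per matching point
        return acc

    # the accumulator maximum gives the top left-hand corner of the best match;
    # add half the template size to quote the position of the template's centre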
The point of best match, the maximum in the accumulator space, is the starting point (the upper left-hand corner) of the template. That's why the position seems to be slightly out. It could have been plotted at the template's centre by adding half the size of the template to the co-ordinates of the maximum point.
Here, the peak is the point of best match. The accumulator space shows that there was some degree of match elsewhere between the template and the image. Note the large border space where no matching process occurs, since the template would extend beyond the image.
We can also implement template matching by the Fourier transform. First, the template must be the same size as the image to which it is to be matched, so we pad it out, inserting white at pixels where the template has no value (remember that our rectangle is black):
Multiplication in the Fourier transform domain is actually equivalent to convolving two images. However, we want to correlate them to work out the point of best match. If we flip one of the images (reversing both its axes), then multiplication of the transform of the flipped image by the transform of the original image gives correlation, i.e. template matching. So we need a routine to flip the template:
Now we can multiply the transform of the image by the transform of the flipped and padded template, and take the inverse transform, to implement template matching in the frequency domain:
Note that the border regions are now occupied. A basic assumption of the DFT is that images replicate to infinity, so the votes in the border regions could be interpreted as incorrect.
Note that we've only used the rearrange function for display purposes. Let's join all this together (but without the rearrange function) as the routine t_corr
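(A Python/NumPy sketch of the same joined-up routine may help; note that it zero-pads the template, whereas the worksheet pads with white, and it flips the template by conjugating its transform, which is equivalent for a real template. The names are illustrative.)

    import numpy as np

    def t_corr(image, template):
        # template matching by correlation in the frequency domain
        rows, cols = image.shape
        trows, tcols = template.shape
        padded = np.zeros((rows, cols))
        padded[:trows, :tcols] = template          # pad the template to the image size
        # multiplying by the conjugate transform correlates rather than convolves
        corr = np.fft.ifft2(np.fft.fft2(image) * np.conj(np.fft.fft2(padded)))
        return np.real(corr)

    # best = np.unravel_index(np.argmax(t_corr(image, template)), image.shape)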
Did you notice any improvement in speed (over t_match)? You should have done. The template is 8x8, which is sufficient size for the FFT algorithm to give advantage. Also, Mathcad's FFT routines are compiled, and so frequency domain implementations in Mathcad can appear supremely fast, but this is artifice.
By image domain calculation, this would require a massive computation. Let's do it by Fourier instead:
For the hardy amongst you, we'll do these bigger images by template matching in the image domain. It takes so long that I switched it off. To make it go, select the operation (click on the rectangle after acc_space) and then press colon, and off it'll go (as will you, for a cuppa.....)
So far, we've only found the position of shapes. If we were required to find a shape irrespective of its size, then we would need to match different sized templates to the image. In turn, this implies a three dimensional accumulator space, where the three dimensions are the spatial co-ordinates and the size. If we were to progress to orientation, to find a shape irrespective of its rotation, then we would increase the dimensions of the accumulator space by one more. Perspective transformation would increase it even further. These are only modifications to the basic scheme: they do not change its basis, only its implementation.
but it's still in the right place. This is one of the virtues of template matching: it finds shapes in the right place, even when there's a lot of noise. Let's see it again, for some variations in the noise level
OK, we've broken it - which is what we'd expect for such a noisy image. In fact, we might not have - it still works occasionally. Just click on the equation generating the noisy image (noisy_picture6:=) and press F9. This gives a different instantiation of the noise. Occasionally it works, but not very often.
Further, we can still detect the shape when it is occluded (hidden) by another shape. For this, we shall use a function which draws a vertical grey bar on the image as
The good performance is because template matching is a version of matched filtering, delivering a result which is optimal by the least squares criterion. We could make the implementation more robust by using edge detection (thereby concentrating on a shape's borders), but so far the only way we have made it faster is by Fourier implementation. The other way to make it faster is to notice that it is actually a mapping operation: this is the Hough Transform (HT). We'll start the HT analysis by looking at the HT for lines.
The basis of the HT for lines is the duality between a line in x,y co-ordinate space, and a line in m,c co-ordinate space. The equation for a line is:
y = mx + c
for points with co-ordinates x,y collinear with gradient m and intercepting the y axis at a value c. If we change this round, we get
c = -xm + y
for points with co-ordinates c,m collinear with gradient -x and intercepting the c axis at a value y. So a point x,y maps to a line in m,c space. The co-ordinate space m,c is the accumulator space; lines drawn in the accumulator space are lines of votes. So the co-ordinates of a point x,y give, for all possible values of m, a set of possible values of c. Each accumulator array location along this line of m,c values is incremented by one vote.
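(Before the Mathcad routine, a minimal Python/NumPy sketch of this voting process may help fix ideas; the quantisation of m and the threshold on the edge image are assumptions of the sketch.)

    import numpy as np

    def ht_lines_mc(edge, m_steps=100, m_max=1.0):
        # accumulate votes in the (m, c) parameter space for each edge point
        rows, cols = edge.shape
        acc = np.zeros((m_steps, rows), dtype=int)     # axes: gradient m, intercept c
        ms = np.linspace(0.0, m_max, m_steps)          # quantised gradient values
        ys, xs = np.nonzero(edge > 0)                  # points that vote
        for x, y in zip(xs, ys):
            for i, m in enumerate(ms):
                c = int(round(y - m * x))              # c = -x*m + y
                if 0 <= c < rows:
                    acc[i, c] += 1                     # one vote per (m, c) cell
        return acc

    # the peak of acc gives the gradient and intercept of the strongest line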
We'll accumulate evidence via the HT for lines: A_line processes the edge image E_mag according to:
Let's synthesise an edge image as a perfect line of brightness value 255 by calling the line function
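(Outside Mathcad, a throwaway Python/NumPy line function such as the following will do; the image size, gradient and intercept below are purely illustrative.)

    import numpy as np

    def draw_line(rows, cols, m, c, value=255):
        # synthesise an edge image containing the perfect line y = m*x + c
        img = np.zeros((rows, cols), dtype=np.uint8)
        for x in range(cols):
            y = int(round(m * x + c))
            if 0 <= y < rows:
                img[y, x] = value
        return img

    # e.g. edge = draw_line(64, 64, 0.5, 10); the HT peak should then occur at m = 0.5, c = 10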
These are the values which we used to draw the original line. So it works! Try another line (by using different values for m and c), and check that it still works.
The major difficulty with the basic HT for lines is the definition of an appropriate parameter space. The one just used takes values of m between 0 and the image size to calculate values of c. In fact, m can be infinite, for a vertical line, which would imply a rather large parameter space. To avoid this, we use the foot-of-normal parameterisation, where lines are represented by
x cos(θ) + y sin(θ) = ρ
where x,y are the co-ordinates in image space and ρ,θ are the co-ordinates of the accumulator space. This equation is used to replace the voting functional c = -xm + y. This gives A_polar
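(For comparison with the Mathcad A_polar, a minimal Python/NumPy sketch of the polar accumulation; the angular quantisation is an assumption, and lines with a negative ρ are simply discarded in this simple version.)

    import numpy as np

    def ht_lines_polar(edge, theta_steps=180):
        # accumulate votes in the (theta, rho) parameter space
        rows, cols = edge.shape
        rho_max = int(np.ceil(np.hypot(rows, cols)))
        acc = np.zeros((theta_steps, rho_max), dtype=int)
        thetas = np.linspace(0.0, np.pi, theta_steps, endpoint=False)
        ys, xs = np.nonzero(edge > 0)
        for x, y in zip(xs, ys):
            for i, t in enumerate(thetas):
                rho = int(round(x * np.cos(t) + y * np.sin(t)))   # x cos(theta) + y sin(theta) = rho
                if 0 <= rho < rho_max:
                    acc[i, rho] += 1
        return acc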
We'd better check this peak out! How can we tell if it's right (especially when there's another peak pretty close)? We need a routine to draw lines using the polar format as
The result is fine. It's just that our line straddles two adjacent cells in the accumulator space. The solution is to make the quantisation of the accumulator space finer
So let's move on to the HT for circles. This uses the same duality as the HT for lines: a circle of points with co-ordinates x,y centre x0,y0 radius r is
(x - x0)² + (y - y0)² = r²
which, viewed in the co-ordinates x0,y0, is also a circle with centre x,y and radius r. So the equation again gives a voting functional, but this time in a three dimensional accumulator space, where the three dimensions are the centre position (x0,y0) and the radius. Unfortunately, 3D matrices are not routine in Mathcad. We'll first need a routine to generate circles as:
So, for a fixed radius we require a 2D accumulator space (equivalent to a slice through the 3D space) giving the routine A_circle as
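(For comparison with the Mathcad A_circle, a minimal Python/NumPy sketch of the fixed-radius accumulation; the angular step used to draw the circles of votes is an assumption.)

    import numpy as np

    def ht_circle(edge, radius, theta_steps=360):
        # one slice of the 3-D accumulator: votes for circle centres at a fixed radius
        rows, cols = edge.shape
        acc = np.zeros((rows, cols), dtype=int)
        thetas = np.linspace(0.0, 2 * np.pi, theta_steps, endpoint=False)
        ys, xs = np.nonzero(edge > 0)
        for x, y in zip(xs, ys):
            for t in thetas:
                x0 = int(round(x - radius * np.cos(t)))   # candidate centre
                y0 = int(round(y - radius * np.sin(t)))
                if 0 <= x0 < cols and 0 <= y0 < rows:
                    acc[y0, x0] += 1
        return acc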
Now try different values of the radius - what happens to the peak? Also, have a look at the structure of the votes in the background, is it what you would expect?
Try other values for the threshold and the radius. Can you explain your results? How could you improve on the results you achieve?
We could now move on to the HT for ellipses. These are defined as
(x - x0)² / a² + (y - y0)² / b² = 1
which is an ellipse with co-ordinates x,y centre x0,y0 with major and minor axes a and b, respectively. This implies that we need a four dimensional accumulator space; if we include rotation we need a five dimensional accumulator space. That could be rather large. It's only a minor modification to the HT for circles anyway - the voting functional just draws ellipses of votes, rather than circles. So we'll now move on to a technique which is not restricted to conic sections such as lines, circles or ellipses.
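(To show just how minor the modification is, here is a sketch of one slice of that accumulator, for fixed, unrotated axes a and b; everything else is as in the circle version, and the names are illustrative.)

    import numpy as np

    def ht_ellipse_fixed_axes(edge, a, b, theta_steps=360):
        # votes for ellipse centres with fixed axes a, b and no rotation
        rows, cols = edge.shape
        acc = np.zeros((rows, cols), dtype=int)
        thetas = np.linspace(0.0, 2 * np.pi, theta_steps, endpoint=False)
        ys, xs = np.nonzero(edge > 0)
        for x, y in zip(xs, ys):
            for t in thetas:
                x0 = int(round(x - a * np.cos(t)))   # candidate centre
                y0 = int(round(y - b * np.sin(t)))
                if 0 <= x0 < cols and 0 <= y0 < rows:
                    acc[y0, x0] += 1
        return acc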
The generalised HT can be used to find arbitrary shapes: shapes which are not expressed in closed analytic form in the way that lines and circles are. Essentially it stores the voting functional as a look-up table, which maps image points to the accumulator space; finding a shape is then equivalent to finding the peak in the accumulator array. The look-up table is indexed by edge direction. The edge direction at a point of the template shape indexes the distance to, and direction of, a reference point. When analysing a target shape, the edge direction at a point is used to vote for possible centre points. Consider finding a shape like a card-suit spade in an image. The template of the spade is:
We now need to make the look-up table, which is actually called the R-table. It needs a reference point; we'll use a function to calculate the centre co-ordinates of our template shape.
We'll initialise the accumulator array cells to zero; the number of bins is the number of possible values of edge direction.
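(A Python/NumPy sketch of the R-table construction and the GHT accumulation may help fix ideas; the direction-quantisation helper below uses simple image gradients rather than the worksheet's Sobel operator, and all names are illustrative. The reference point can be taken as the centre co-ordinates computed above.)

    import numpy as np

    def edge_direction_bins(image, bins):
        # quantised gradient direction (the worksheet uses the Sobel operator instead)
        gy, gx = np.gradient(image.astype(float))
        theta = np.arctan2(gy, gx) % (2 * np.pi)
        return (theta / (2 * np.pi) * bins).astype(int) % bins

    def build_r_table(template_edges, template_image, ref, bins=90):
        # R-table: edge direction -> list of vectors from edge point to the reference point
        direction = edge_direction_bins(template_image, bins)
        r_table = [[] for _ in range(bins)]
        ys, xs = np.nonzero(template_edges > 0)
        for x, y in zip(xs, ys):
            r_table[direction[y, x]].append((ref[0] - x, ref[1] - y))
        return r_table

    def ght_accumulate(target_edges, target_image, r_table, bins=90):
        # each target edge point votes for the reference points its direction allows
        rows, cols = target_edges.shape
        acc = np.zeros((rows, cols), dtype=int)
        direction = edge_direction_bins(target_image, bins)
        ys, xs = np.nonzero(target_edges > 0)
        for x, y in zip(xs, ys):
            for dx, dy in r_table[direction[y, x]]:
                x0, y0 = x + dx, y + dy                # candidate reference point
                if 0 <= x0 < cols and 0 <= y0 < rows:
                    acc[y0, x0] += 1
        return acc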
Let's apply it. We'll add some noise to our template, shift it, and then try to find it again. First we'll add some noise to a shifted image:
Impressive, eh! But remember, it's a result (nearly) equivalent to template matching. To run it again, with a different selection of noise points, just select the function noisy_template := addcondiments() and run it again. Each time, you'll get a different pattern of noise. Check that the extraction still works.
This is pretty close. Out of interest, let's see the result by template matching. We're matching 31x31 images, so we'll use Fourier. We'll need to pad out our image (for the FFT) to 32x32 so we'll need pointers
These are the co-ordinates of the top left-hand corner. The centre is at 10,10, so this again gives the right answer. The difference between template matching and the GHT is that the GHT uses edge direction and a discrete set of bins. If the Fourier implementation of template matching were not artificially fast, you'd see no speed advantage in template matching.
Not such a bad result: the adjacent peak is not quite so high as the main peak. Look at a printout to see the dispersion.
So we can now find arbitrary shapes too, in a manner providing a result which is equivalent to template matching, but faster. This is where we will leave evidence gathering techniques for the moment, and return to them later. We'll now move to techniques which evolve to the solution, rather than find it by post-processing an accumulator array.